127 research outputs found
Protein microenvironments for topology analysis
Previously held under moratorium from 1st December 2016 until 1st December 2021.
Amino acid residues are often the focus of research on protein structures. However, in a folded protein, each residue finds itself in an environment that is defined
by the properties of its surrounding residues. The term microenvironment is used
herein to refer to these local ensembles. Not only do they have chemical properties but also topological properties which quantify concepts such as density,
boundaries between domains and junction complexity. These quantifications are
used to project a protein's backbone structure into a series of scores.
The hypothesis was that these sequences of scores can be used to discover protein
domains and motifs and that they can be used to align and compare groups of
3D protein structures.
This research sought to implement a system that could efficiently compute microenvironments such that they can be applied routinely to large datasets. The
computation of the microenvironments was the most challenging aspect in terms
of performance, and the optimisations required are described.
Methods of scoring microenvironments were developed to enable the extraction
of domain and motif data without 3D alignment. The problem of allosteric site
detection was addressed with a classifier that gave high rates of allosteric site
detection.
Overall, this work describes the development of a system that scales well with
increasing dataset sizes. It builds on existing techniques to automatically detect the boundaries of domains, and demonstrates the ability to process
large datasets by application to allosteric site detection, a problem that has not
previously been adequately solved.
Sharing large data collections between mobile peers
New directions in the provision of end-user computing experiences mean that we need to determine the best way to share data between small mobile computing devices. Partitioning large structures so that they can be shared efficiently provides a basis for data-intensive applications on such platforms. In conjunction with such an approach, dictionary-based compression techniques provide additional benefits and help to prolong battery life.
Local pre-processing for node classification in networks : application in protein-protein interaction
Network modelling provides an increasingly popular conceptualisation in a wide range of domains, including the analysis of protein structure. Typical approaches to analysis model parameter values at nodes within the network. The spherical locality around a node provides a microenvironment that can be used to characterise an area of a network rather than a particular point within it. Microenvironments that centre on the nodes in a protein chain can be used to quantify parameters that are related to protein functionality. They also permit particular patterns of such parameters in node-centred microenvironments to be used to locate sites of particular interest. This paper evaluates an approach to index generation that seeks to rapidly construct microenvironment data. The results show that index generation performs best when the radius of microenvironments matches the granularity of the index. Results are presented to show that such microenvironments improve the utility of protein chain parameters in classifying the structural characteristics of nodes using both support vector machines and neural networks.
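The idea of an index whose granularity matches the microenvironment radius can be illustrated with a uniform grid. The following is a minimal sketch under stated assumptions, not the paper's implementation: the function names `build_grid_index` and `microenvironment`, the use of plain coordinate tuples for nodes, and the choice of a cubic grid are all hypothetical.

```python
from collections import defaultdict
from itertools import product
import math


def cell_of(point, cell_size):
    """Integer grid cell containing a 3D point (hypothetical helper)."""
    return tuple(math.floor(c / cell_size) for c in point)


def build_grid_index(points, cell_size):
    """Map each grid cell to the indices of the points that fall inside it."""
    index = defaultdict(list)
    for i, point in enumerate(points):
        index[cell_of(point, cell_size)].append(i)
    return index


def microenvironment(points, index, cell_size, centre, radius):
    """Indices of all points within `radius` of points[centre], excluding it.

    Only cells that can intersect the sphere are visited. When cell_size
    equals radius, that is the 3 x 3 x 3 block around the centre's cell.
    """
    home = cell_of(points[centre], cell_size)
    reach = math.ceil(radius / cell_size)
    members = []
    for offset in product(range(-reach, reach + 1), repeat=3):
        cell = tuple(h + o for h, o in zip(home, offset))
        for j in index.get(cell, ()):
            if j != centre and math.dist(points[centre], points[j]) <= radius:
                members.append(j)
    return sorted(members)


# Toy coordinates, invented for illustration only.
points = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 5), (-1.5, 0, 0)]
radius = 2.0
index = build_grid_index(points, cell_size=radius)
neighbours = microenvironment(points, index, radius, centre=0, radius=radius)
density = len(neighbours)  # one simple per-node score
```

With the cell size equal to the radius, each query inspects a fixed block of 27 cells. A much finer grid means many more cells to visit per query, while a much coarser one means more distance checks on points that lie outside the sphere.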
Multi-Messenger Astronomy with Extremely Large Telescopes
The field of time-domain astrophysics has entered the era of Multi-messenger
Astronomy (MMA). One key science goal for the next decade (and beyond) will be
to characterize gravitational wave (GW) and neutrino sources using the next
generation of Extremely Large Telescopes (ELTs). These studies will have a
broad impact across astrophysics, informing our knowledge of the production and
enrichment history of the heaviest chemical elements, constraining the dense
matter equation of state, providing independent constraints on cosmology,
increasing our understanding of particle acceleration in shocks and jets, and
studying the lives of black holes in the universe. Future GW detectors will
greatly improve their sensitivity during the coming decade, as will
near-infrared telescopes capable of independently finding kilonovae from
neutron star mergers. However, the electromagnetic counterparts to
high-frequency (LIGO/Virgo band) GW sources will be distant and faint and thus
demand ELT capabilities for characterization. ELTs will be important and
necessary contributors to an advanced and complete multi-messenger network.
Comment: White paper submitted to the Astro2020 Decadal Survey
The Carnegie Supernova Project: First Near-Infrared Hubble Diagram to z~0.7
The Carnegie Supernova Project (CSP) is designed to measure the luminosity
distance for Type Ia supernovae (SNe Ia) as a function of redshift, and to set
observational constraints on the dark energy contribution to the total energy
content of the Universe. The CSP differs from other projects to date in its
goal of providing a rest-frame I-band Hubble diagram. Here we present the
first results from near-infrared (NIR) observations obtained using the Magellan
Baade telescope for SNe Ia with 0.1 < z < 0.7. We combine these results with
those from the low-redshift CSP at z <0.1 (Folatelli et al. 2009). We present
light curves and an I-band Hubble diagram for this first sample of 35 SNe Ia
and we compare these data to 21 new SNe Ia at low redshift. These data support
the conclusion that the expansion of the Universe is accelerating. When
combined with independent results from baryon acoustic oscillations (Eisenstein
et al. 2005), these data yield Omega_m = 0.27 +/- 0.0 (statistical), and
Omega_DE = 0.76 +/- 0.13 (statistical) +/- 0.09 (systematic), for the matter
and dark energy densities, respectively. If we parameterize the data in terms
of an equation of state, w, assume a flat geometry, and combine with baryon
acoustic oscillations, we find that w = -1.05 +/- 0.13 (statistical) +/- 0.09
(systematic). The largest source of systematic uncertainty on w arises from
uncertainties in the photometric calibration, signaling the importance of
securing more accurate photometric calibrations for future supernova cosmology
programs. Finally, we conclude that either the dust affecting the luminosities
of SNe Ia has a different extinction law (R_V = 1.8) than that in the Milky Way
(where R_V = 3.1), or that there is an additional intrinsic color term with
luminosity for SNe Ia independent of the decline rate.
Comment: 44 pages, 23 figures, 9 tables; accepted for publication in the
Astrophysical Journal
Genetic associations of protein-coding variants in human disease.
Genome-wide association studies (GWAS) have identified thousands of genetic variants linked to the risk of human disease. However, GWAS have so far remained largely underpowered in relation to identifying associations in the rare and low-frequency allelic spectrum and have lacked the resolution to trace causal mechanisms to underlying genes. Here we combined whole-exome sequencing in 392,814 UK Biobank participants with imputed genotypes from 260,405 FinnGen participants (653,219 total individuals) to conduct association meta-analyses for 744 disease endpoints across the protein-coding allelic frequency spectrum, bridging the gap between common and rare variant studies. We identified 975 associations, with more than one-third being previously unreported. We demonstrate population-level relevance for mutations previously ascribed to causing single-gene disorders, map GWAS associations to likely causal genes, explain disease mechanisms, and systematically relate disease associations to levels of 117 biomarkers and clinical-stage drug targets. Combining sequencing and genotyping in two population biobanks enabled us to benefit from increased power to detect and explain disease associations, validate findings through replication and propose medical actionability for rare genetic variants. Our study provides a compendium of protein-coding variant associations for future insights into disease biology and drug discovery.